Mixing times are hitting times of large sets

Abstract

We consider irreducible reversible discrete time Markov chains on a finite state space. Mixing
times and hitting times are fundamental parameters of the chain. We relate them by showing
that the mixing time of the lazy chain is equivalent to the maximum over initial states $x$ and
large sets $A$ of the hitting time of $A$ starting from $x$. We also prove that the first time when
averaging over two consecutive time steps is close to stationarity is equivalent to the mixing
time of the lazy version of the chain.

1 Introduction

Mixing times and hitting times are among the most fundamental notions associated with a finite
Markov chain. A variety of tools have been developed to estimate both these notions; in particular,
hitting times are closely related to potential theory and they can be determined by solving a system
of linear equations. In this paper we establish a new connection between mixing times and hitting
times for reversible Markov chains (Theorem 1.1).

Let $(X_t)_{t \ge 0}$ be an irreducible Markov chain on a finite state space with transition matrix $P$ and
stationary distribution $\pi$. For $x, y$ in the state space we write

\[ P^t(x, y) = \mathbb{P}_x(X_t = y) \]

for the transition probability in $t$ steps.

Let $d(t) = \max_x \|P^t(x, \cdot) - \pi\|$, where $\|\mu - \nu\|$ stands for the total variation distance between the
two probability measures $\mu$ and $\nu$. Let $\varepsilon > 0$. The total variation mixing time is defined as follows:

\[ t_{\mathrm{mix}}(\varepsilon) = \min\{t \ge 0 : d(t) \le \varepsilon\}. \]

We write $P_L^t$ for the transition probability in $t$ steps of the lazy version of the chain, i.e. the chain
with transition matrix $\frac{P + I}{2}$. If we now let $d_L(t) = \max_x \|P_L^t(x, \cdot) - \pi\|$, then we can define the mixing
time of the lazy chain as follows:

\[ t_L(\varepsilon) = \min\{t \ge 0 : d_L(t) \le \varepsilon\}. \tag{1.1} \]

For notational convenience we will simply write $t_L$ and $t_{\mathrm{mix}}$ when $\varepsilon = 1/4$.

Before stating our first theorem, we introduce the maximum hitting time of "big" sets. Let $\alpha < 1/2$,
then we define

\[ t_H(\alpha) = \max_{x,\, A : \pi(A) \ge \alpha} \mathbb{E}_x[\tau_A], \]

where $\tau_A$ stands for the first hitting time of the set $A$.
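Since hitting times solve a system of linear equations, $t_H(\alpha)$ can be computed by brute force on small chains. The following Python sketch is an illustration only and is not part of the paper: the function names `solve`, `hitting_times` and `t_hit`, as well as the 3-state birth-and-death chain, are assumptions made for the example, and the enumeration over all sets $A$ is exponential in the number of states.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def hitting_times(P, A):
    """Return h with h[x] = E_x[tau_A]: h = 0 on A, h(x) = 1 + sum_y P(x,y) h(y) off A."""
    n = len(P)
    rest = [x for x in range(n) if x not in A]
    h = [0.0] * n
    if rest:
        a = [[(1.0 if x == y else 0.0) - P[x][y] for y in rest] for x in rest]
        for x, v in zip(rest, solve(a, [1.0] * len(rest))):
            h[x] = v
    return h


def t_hit(P, pi, alpha):
    """Brute-force t_H(alpha): maximum of E_x[tau_A] over x and sets A with pi(A) >= alpha."""
    n = len(P)
    best = 0.0
    for size in range(1, n + 1):
        for A in combinations(range(n), size):
            if sum(pi[x] for x in A) >= alpha - 1e-12:
                best = max(best, max(hitting_times(P, set(A))))
    return best


# Hypothetical example: a lazy birth-and-death chain on {0, 1, 2}.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
pi = [0.25, 0.5, 0.25]
print(t_hit(P, pi, 0.25), t_hit(P, pi, 0.4))
```

For this example chain every singleton qualifies when $\alpha = 1/4$, and the worst case is hitting one endpoint from the other; for $\alpha = 0.4$ the two endpoints on their own are too small to count, so the maximum drops.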

It is clear (and we prove it later) that if the Markov chain has not hit a big set, then it cannot have
mixed. Thus for every $\alpha > 0$, there is a positive constant $c'_\alpha$ so that

\[ t_L \ge c'_\alpha\, t_H(\alpha). \]

In the following theorem, we show that the converse is also true when a chain is reversible.

Theorem 1.1. Let $\alpha < 1/2$. Then there exist positive constants $c_\alpha$ and $c'_\alpha$ so that for every
reversible chain

\[ c'_\alpha\, t_H(\alpha) \le t_L \le c_\alpha\, t_H(\alpha). \]

Remark 1.2. Aldous in [2] showed that the mixing time, $t_{\mathrm{cts}}$, of a continuous time reversible chain
is equivalent to $t_{\mathrm{prod}} = \max_{x,\, A : \pi(A) > 0} \pi(A)\, \mathbb{E}_x[\tau_A]$. The inequality $t_{\mathrm{prod}} \le c_1 t_{\mathrm{cts}}$, for a positive constant
$c_1$, which was the hard part in Aldous' proof, follows from Theorem 1.1 and the equivalence $t_L \asymp t_{\mathrm{cts}}$
(see [5, Theorem 20.3]). For the other direction we give a new proof in a later section.

Remark 1.3. In Section 9 we present an application of Theorem 1.1 to robustness of the mixing
time. Namely, we show that for a finite binary tree, assigning bounded conductances to the edges
can only change the mixing time of the lazy random walk on the tree by a bounded factor.



To avoid periodicity and near-periodicity issues, one often considers the lazy version of a discrete
time Markov chain. In the following theorem we show that averaging over two successive times
suffices, i.e. $t_L \asymp t_{\mathrm{ave}}(1/4)$, where

\[ t_{\mathrm{ave}}(\varepsilon) = \min\left\{ t \ge 0 : \max_x \left\| \frac{P^t(x, \cdot) + P^{t+1}(x, \cdot)}{2} - \pi \right\| \le \varepsilon \right\}. \]

For notational convenience we will simply write $t_{\mathrm{ave}}$ when $\varepsilon = 1/4$.

Theorem 1.4. There exist universal positive constants $c$ and $c'$ so that for every reversible Markov
chain

\[ c\, t_L \le t_{\mathrm{ave}} \le c'\, t_L. \]

The problem of relating $t_{\mathrm{ave}}$ to the mixing time $t_{\mathrm{cts}}$ of the continuous-time chain was raised in
Aldous-Fill [1], Chapter 4, Open Problem 17. Since $t_{\mathrm{cts}} \asymp t_L$ (see [5, Theorem 20.3]), Theorem 1.4
gives a partial answer to that problem.



2 Preliminaries and further equivalences 



In this section we first introduce some more notions of mixing. We will then state some further
equivalences between them, mostly in the reversible case, and will prove them in later sections. These
equivalences will be useful for the proofs of the main results, but are also of independent interest.

The following notion of mixing was first introduced by Aldous in [2] in the continuous time case
and later studied in discrete time by Lovász and Winkler in [HE]. It is defined as follows:

\[ t_{\mathrm{stop}} = \max_x \min\{\mathbb{E}_x[\Lambda_x] : \Lambda_x \text{ is a stopping time s.t. } \mathbb{P}_x(X_{\Lambda_x} \in \cdot) = \pi(\cdot)\}. \tag{2.1} \]

The definition does not make it clear why stopping times achieving the minimum always exist. We
will recall the construction of such a stopping time in Section 3.

The mixing time of the lazy chain and the average mixing are related to $t_{\mathrm{stop}}$ in the following way.

Lemma 2.1. There exists a uniform positive constant $c_1$ so that for every reversible Markov chain

\[ t_{\mathrm{ave}} \le c_1 t_{\mathrm{stop}}. \]

Lemma 2.2. There exists a uniform positive constant $c_2$ so that for every reversible Markov chain

\[ t_{\mathrm{stop}} \le c_2 t_L. \]

We will prove Lemma 2.1 in Section 3. Lemma 2.2 was proved by Aldous in [2], but we include the
proof in Section 4 for completeness.

In Section 4 we will show that for any chain we have the following:

Lemma 2.3. For every $\varepsilon \le 1/4$, there exists a positive constant $c_3$ so that for every Markov chain
we have that

\[ t_L(\varepsilon) \le c_3\, t_{\mathrm{ave}}(\varepsilon). \]

Definition 2.4. We say that two mixing parameters $s$ and $r$ are equivalent for a class of Markov
chains $\mathcal{M}$ and write $s \asymp r$, if there exist universal positive constants $c$ and $c'$ so that $cs \le r \le c's$
for every chain in $\mathcal{M}$.

Proof of Theorem 1.4. Lemmas 2.1, 2.2 and 2.3 give the desired equivalence between $t_{\mathrm{ave}}$ and
$t_L$. □

Combining the three lemmas above we get the following:

Corollary 2.5. For every reversible Markov chain $t_L$ and $t_{\mathrm{stop}}$ are equivalent.

Remark 2.6. Aldous in [2] was the first to show the equivalence between the mixing time of a
continuous time reversible chain and $t_{\mathrm{stop}}$.

We will now define the notion of mixing in a geometric time. The idea of using this notion of
mixing to prove Theorem 1.1 was suggested to us by Oded Schramm (private communication June
2008). This notion is also of independent interest, because of its properties that we will prove in
this section.

For each $t$, let $Z_t$ be a Geometric random variable taking values in $\{1, 2, \ldots\}$ of mean $t$ and success
probability $t^{-1}$. We first define

\[ d_G(t) = \max_x \|\mathbb{P}_x(X_{Z_t} = \cdot) - \pi\|. \]

The geometric mixing is then defined as follows

\[ t_G = t_G(1/4) = \min\{t \ge 0 : d_G(t) \le 1/4\}. \]

We start by establishing the monotonicity property of $d_G(t)$.

Lemma 2.7. The total variation distance $d_G(t)$ is decreasing as a function of $t$.

Before proving this lemma, we note the following standard fact.

Claim 2.1. Let $T$ and $T'$ be two independent positive random variables, also independent of the
Markov chain. Then for all $x$

\[ \|\mathbb{P}_x(X_{T+T'} = \cdot) - \pi\| \le \|\mathbb{P}_x(X_T = \cdot) - \pi\|. \]

Proof of Lemma 2.7. We first describe a coupling between the two Geometric random variables,
$Z_t$ and $Z_{t+1}$. Let $(U_i)_{i \ge 1}$ be a sequence of i.i.d. random variables uniform on $[0, 1]$. We now define

\[ Z_t = \min\left\{ i \ge 1 : U_i \le \frac{1}{t} \right\} \quad \text{and} \quad Z_{t+1} = \min\left\{ i \ge 1 : U_i \le \frac{1}{t+1} \right\}. \]

It is easy to see that $Z_{t+1} - Z_t$ is independent of $Z_t$. Indeed, $\mathbb{P}(Z_{t+1} = Z_t \mid Z_t) = \frac{t}{t+1}$ and similarly for every $k \ge 1$ we have

\[ \mathbb{P}(Z_{t+1} = Z_t + k \mid Z_t) = \frac{1}{t+1} \left( \frac{t}{t+1} \right)^{k-1} \frac{1}{t+1}. \]

We can thus write $Z_{t+1} = (Z_{t+1} - Z_t) + Z_t$, where the two terms are independent.

Claim 2.1 and the independence of $Z_{t+1} - Z_t$ and $Z_t$ give the desired monotonicity of $d_G(t)$. □
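The independence used in this coupling can be checked exactly for small parameters. The sketch below is an illustration under stated assumptions, not part of the paper: the function names are hypothetical, $t$ is taken to be a positive integer, and the joint law is only enumerated up to a cutoff `n_max`. It builds the joint law of $(Z_t, Z_{t+1} - Z_t)$ from the three possible outcomes of each uniform variable and compares it with the product of the marginals.

```python
from fractions import Fraction


def joint_pmf(t, n_max):
    """Exact joint law of (Z_t, Z_{t+1} - Z_t) under the coupling, for Z_{t+1} <= n_max."""
    both = Fraction(1, t + 1)        # U_i <= 1/(t+1): both clocks ring
    only = Fraction(1, t) - both     # 1/(t+1) < U_i <= 1/t: only Z_t rings
    none = 1 - Fraction(1, t)        # U_i > 1/t: neither rings
    pmf = {}
    for a in range(1, n_max + 1):
        before = none ** (a - 1)     # no ring during the first a - 1 steps
        pmf[(a, 0)] = before * both
        for k in range(1, n_max - a + 1):
            # after Z_t = a, wait k more steps for the first U_i <= 1/(t+1)
            pmf[(a, k)] = before * only * (1 - both) ** (k - 1) * both
    return pmf


def geometric_pmf(t, a):
    """P(Z_t = a) for a Geometric variable on {1, 2, ...} with success probability 1/t."""
    return Fraction(1, t) * (1 - Fraction(1, t)) ** (a - 1)


def diff_pmf(t, k):
    """Claimed law of Z_{t+1} - Z_t: an atom t/(t+1) at 0, geometric tail otherwise."""
    if k == 0:
        return Fraction(t, t + 1)
    return Fraction(1, t + 1) * Fraction(t, t + 1) ** (k - 1) * Fraction(1, t + 1)


pmf = joint_pmf(3, 8)
print(all(p == geometric_pmf(3, a) * diff_pmf(3, k) for (a, k), p in pmf.items()))
```

Exact rational arithmetic is used so that the factorisation can be tested with equality rather than a tolerance.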

Lemma 2.8. For all chains we have that

\[ t_G \le 4 t_{\mathrm{stop}} + 1. \]

The converse of Lemma 2.8 is true for reversible chains in a more general setting. Namely, let $N_t$
be a random variable independent of the Markov chain and of mean $t$. We define the total variation
distance $d_N(t)$ in this setting as follows:

\[ d_N(t) = \max_x \|\mathbb{P}_x(X_{N_t} = \cdot) - \pi\|. \]

Defining $t_N = t_N(1/4) = \min\{t \ge 0 : d_N(t) \le 1/4\}$ we have the following:

Lemma 2.9. There exists a positive constant $c_4$ such that for all reversible chains

\[ t_{\mathrm{stop}} \le c_4 t_N. \]

In particular, $t_{\mathrm{stop}} \le c_4 t_G$.

We will give the proofs of Lemmas 2.8 and 2.9 in Section 5.
Combining Corollary 2.5 with Lemmas 2.8 and 2.9 we deduce:

Theorem 2.10. For a reversible Markov chain $t_G$ and $t_L$ are equivalent.

We end this section by stating and proving a result relating $t_{\mathrm{mix}}$ and $t_{\mathrm{ave}}$ for any Markov chain.
First by the triangle inequality it is clear that always $t_{\mathrm{ave}} \le t_{\mathrm{mix}}$. For the converse we have the
following:

Proposition 2.11. Let $0 < \delta \le 1/2$. There exists a positive constant $c_5$ so that if $P$ is a transition
matrix satisfying $P(x, x) \ge \delta$, for all $x$, then

\[ t_{\mathrm{mix}} \le c_5 \left( t_{\mathrm{ave}} \vee \frac{1}{\delta} \right). \]

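Proof. By the triangle inequality we have that for all $x$

\[ \|P^t(x, \cdot) - \pi\| \le \left\| \tfrac12 P^t(x, \cdot) + \tfrac12 P^{t+1}(x, \cdot) - \pi \right\| + \left\| \tfrac12 P^t(x, \cdot) - \tfrac12 P^{t+1}(x, \cdot) \right\|. \]

Thus it suffices to show that for all starting points $x$ and all times $t$ there exists a positive constant
$c_6$ such that

\[ \|P^t(x, \cdot) - P^{t+1}(x, \cdot)\| \le \frac{c_6}{\sqrt{t\delta}}, \tag{2.2} \]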



since $t_{\mathrm{mix}}(\varepsilon) \le c_7 t_{\mathrm{mix}}$ for a positive constant $c_7$ and $\varepsilon < \frac14$.

We will now construct a coupling $(X_t, Y_{t+1})$ of $P^t(x, \cdot)$ with $P^{t+1}(x, \cdot)$ such that

\[ \mathbb{P}(X_t \ne Y_{t+1}) \le \frac{c_6}{\sqrt{t\delta}}. \]

Since for all $x$ we have that $P(x, x) \ge \delta$, we can write

\[ P = \delta I + (1 - \delta) Q, \]

for a stochastic matrix $Q$. Let $Z$ be a chain with transition matrix $Q$. Let $N_t$ and $N'_t$ be independent
and both distributed according to $\mathrm{Bin}(t, 1 - \delta)$. We are now going to describe the coupling for the
two chains, $X$ and $Y$. Let $(W_s)_{s \ge 1}$ and $(W'_s)_{s \ge 1}$ be i.i.d. random variables with
$\mathbb{P}(W_1 = 0) = 1 - \mathbb{P}(W_1 = 1) = \delta$. We define $N_t = \sum_{s=1}^{t} W_s$ and $N'_t = \sum_{s=1}^{t} W'_s$ and we set $Y_t = Z_{N'_t}$. For $t \ge 1$ we
define

\[ X_t = \begin{cases} Z_{N_t} & \text{if } X_{t-1} \ne Y_t, \\ Y_{t+1} & \text{if } X_{t-1} = Y_t. \end{cases} \]

Then it is easy to check that indeed $X$ and $Y$ both have the same transition matrix $P$. We now
let $\tau = \min\{t \ge 0 : X_t = Y_{t+1}\}$. If $W'_1 = 0$, i.e. $Y_1 = X_0 = x$, then $\tau = 0$. Otherwise, on the event
$W'_1 = 1$, we can bound $\tau$ by

\[ \tau \le \min\left\{ t \ge 0 : N_t = 1 + \sum_{s=2}^{t+1} W'_s \right\}. \]

We thus see that $\tau$ is stochastically dominated by the first time that $N_t - \sum_{s=2}^{t+1} W'_s$ hits 1. But
$N_t - \sum_{s=2}^{t+1} W'_s$ is a symmetric random walk on the real line with transition probabilities $p(k, k+1) =
p(k, k-1) = \delta(1 - \delta)$ for all $k$. By time $t$ this random walk has moved $L$ number of times, where

\[ L \sim \mathrm{Bin}(t, 2\delta(1 - \delta)). \]

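By the Chernoff bound for Binomial random variables we get that

\[ \mathbb{P}\left( L < \frac{t\delta}{2} \right) \le e^{-t\delta/8}. \tag{2.3} \]

Therefore we have that

\[ \mathbb{P}(\tau > t) \le \mathbb{P}\left( L < \frac{t\delta}{2} \right) + \mathbb{P}\left( \tau > t,\ L \ge \frac{t\delta}{2} \right) \le e^{-t\delta/8} + \mathbb{P}\left( T_1 > \frac{t\delta}{2} \right), \]

where $T_1$ denotes the first hitting time of 1 for a simple random walk on $\mathbb{Z}$. By a classical result
for simple random walks on $\mathbb{Z}$ (see for instance [5, Theorem 2.17])

\[ \mathbb{P}\left( T_1 > \frac{t\delta}{2} \right) \le \frac{12\sqrt{2}}{\sqrt{t\delta}} \]

and this concludes the proof. □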

Remark 2.12. We note that the upper bound given in Proposition 2.11 is tight, in the sense that
both $t_{\mathrm{ave}}$ and $\frac{1}{\delta}$ can be attained. Indeed, for lazy chains $t_{\mathrm{mix}}$ and $t_{\mathrm{ave}}$ are equivalent. This follows
from the observation above that $t_{\mathrm{ave}} \le t_{\mathrm{mix}}$ and [5, Proposition 5.6]. For $\delta < 1/2$, consider the
following transition matrix $\begin{pmatrix} \delta & 1 - \delta \\ 1 - \delta & \delta \end{pmatrix}$. It is easy to see that in this case the mixing time is
of order $\frac{1}{\delta}$.
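A quick numerical check makes the gap between $t_{\mathrm{mix}}$ and $t_{\mathrm{ave}}$ in this remark concrete. The sketch below is illustrative only: it assumes the two-state chain that holds with probability $\delta$ and switches with probability $1 - \delta$, and the helper names `two_state` and `mixing_times` are hypothetical. It computes both times directly from the definitions with $\varepsilon = 1/4$.

```python
def step(mu, P):
    """One step of the chain applied to a row vector: mu P."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def tv(mu, nu):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def mixing_times(P, pi, eps=0.25, t_max=100_000):
    """Return (t_mix, t_ave) computed from the definitions, or None if not reached."""
    n = len(P)
    rows = [[1.0 if y == x else 0.0 for y in range(n)] for x in range(n)]  # P^0
    t_mix = t_ave = None
    for t in range(t_max):
        nxt = [step(r, P) for r in rows]                                   # P^{t+1}
        if t_mix is None and max(tv(r, pi) for r in rows) <= eps:
            t_mix = t
        avg = max(tv([(a + b) / 2 for a, b in zip(r, s)], pi)
                  for r, s in zip(rows, nxt))
        if t_ave is None and avg <= eps:
            t_ave = t
        if t_mix is not None and t_ave is not None:
            break
        rows = nxt
    return t_mix, t_ave


def two_state(delta):
    """Assumed example: hold with probability delta, switch with probability 1 - delta."""
    return [[delta, 1.0 - delta], [1.0 - delta, delta]]


for delta in (0.1, 0.01):
    print(delta, mixing_times(two_state(delta), [0.5, 0.5]))
```

For this chain the two-step average is already close to uniform at time 0, while the unaveraged distance decays like $(1 - 2\delta)^t$, so the computed $t_{\mathrm{mix}}$ grows roughly in proportion to $1/\delta$.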



3 Stopping times and a bound for $t_{\mathrm{ave}}$

In this section we will first give the construction of a stopping time $T$ that achieves stationarity,
i.e. for all $x, y$ we have that $\mathbb{P}_x(X_T = y) = \pi(y)$, and also for a fixed $x$ attains the minimum in the
definition of $t_{\mathrm{stop}}$ in (2.1), i.e.

\[ \mathbb{E}_x[T] = \min\{\mathbb{E}_x[\Lambda_x] : \Lambda_x \text{ is a stopping time s.t. } \mathbb{P}_x(X_{\Lambda_x} \in \cdot) = \pi(\cdot)\}. \tag{3.1} \]

The stopping time that we will construct is called the filling rule and it was first discussed in [3].
This construction can also be found in [1, Chapter 9], but we include it here for completeness.

First for any stopping time $S$ and any starting distribution $\mu$ one can define a sequence of vectors

\[ \theta_x(t) = \mathbb{P}_\mu(X_t = x,\ S \ge t), \qquad \sigma_x(t) = \mathbb{P}_\mu(X_t = x,\ S = t). \tag{3.2} \]

These vectors clearly satisfy

\[ 0 \le \sigma(t) \le \theta(t), \qquad (\theta(t) - \sigma(t)) P = \theta(t+1) \quad \forall t; \qquad \theta(0) = \mu. \tag{3.3} \]

We can also do the converse, namely given vectors $(\theta(t), \sigma(t);\ t \ge 0)$ satisfying (3.3) we can construct
a stopping time $S$ satisfying (3.2). We want to define $S$ so that

\[ \mathbb{P}(S = t \mid S > t - 1,\ X_t = x,\ X_{t-1} = x_{t-1}, \ldots, X_0 = x_0) = \frac{\sigma_x(t)}{\theta_x(t)}. \tag{3.4} \]

Formally we define the random variable $S$ as follows: Let $(U_i)_{i \ge 0}$ be a sequence of independent
random variables uniform on $[0, 1]$. We now define $S$ via

\[ S = \inf\left\{ t \ge 0 : U_t \le \frac{\sigma_{X_t}(t)}{\theta_{X_t}(t)} \right\}. \]

From this definition it is clear that (3.4) is satisfied and that $S$ is a stopping time with respect to an
enlarged filtration containing also the random variables $(U_i)_{i \ge 0}$, namely $\mathcal{F}_s = \sigma(X_0, U_0, \ldots, X_s, U_s)$.
Also, equations (3.2) are satisfied. Indeed, setting $x_t = x$ we have

\[ \mathbb{P}_\mu(X_t = x,\ S \ge t) = \sum_{x_0, x_1, \ldots, x_{t-1}} \mu(x_0) \prod_{k=0}^{t-1} \left( 1 - \frac{\sigma_{x_k}(k)}{\theta_{x_k}(k)} \right) P(x_k, x_{k+1}) = \theta_x(t), \]

since $\theta_y(0) = \mu(y)$ for all $y$ and also $\theta(t+1) = (\theta(t) - \sigma(t))P$, so cancellations happen. Similarly we
get the other equality of (3.2).



We are now ready to give the construction of the filling rule $T$. Before defining it formally, we
give the intuition behind it. Every state $x$ has a quota which is equal to $\pi(x)$. Starting from an
initial distribution $\mu$ we want to calculate inductively the probability that we have stopped so far
at each state. When we reach a new state, we decide to stop there if doing so does not increase
the probability of stopping at that state above the quota. Otherwise we stop there with the right
probability to exactly fill the quota and we continue with the complementary probability.

We will now give the rigorous construction by defining the sequence of vectors $(\theta(t), \sigma(t);\ t \ge 0)$
for any starting distribution $\mu$. If we start from $x$, then simply $\mu = \delta_x$. First we set $\theta(0) = \mu$. We
now introduce another sequence of vectors $(\Sigma(t);\ t \ge -1)$. Let $\Sigma_x(-1) = 0$ for all $x$. We define
inductively

\[ \sigma_x(t) = \begin{cases} \theta_x(t), & \text{if } \Sigma_x(t-1) + \theta_x(t) \le \pi(x); \\ \pi(x) - \Sigma_x(t-1), & \text{otherwise.} \end{cases} \]

Then we let $\Sigma_x(t) = \sum_{s \le t} \sigma_x(s)$ and define $\theta(t+1)$ via (3.3). Then $\sigma$ will satisfy (3.2) and
$\Sigma_x(t) = \mathbb{P}_\mu(X_T = x,\ T \le t)$. Also note from the description above it follows that $\Sigma_x(t) \le \pi(x)$, for
all $x$ and all $t$. Thus we get that

\[ \mathbb{P}_\mu(X_T = x) = \lim_{t \to \infty} \Sigma_x(t) \le \pi(x) \]

and since both $\mathbb{P}_\mu(X_T = \cdot)$ and $\pi(\cdot)$ are probability distributions, we get that they must be equal.
Hence the above construction yielded a stationary stopping time. It only remains to prove the
mean-optimality (3.1). Before doing so we give a definition.
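The recursion for $(\theta(t), \sigma(t), \Sigma(t))$ translates directly into a short computation. The following Python sketch is an illustration, not the authors' code: the function name `filling_rule`, the truncation tolerance and the 3-state example chain are assumptions. It iterates the vectors and accumulates $\mathbb{E}_\mu[T] = \sum_{t \ge 0} \mathbb{P}_\mu(T > t)$, using that $\theta_x(t) - \sigma_x(t) = \mathbb{P}_\mu(X_t = x,\ T > t)$.

```python
def filling_rule(P, pi, mu, t_max=100_000, tol=1e-12):
    """Iterate the filling-rule vectors; return (Sigma at the last step, E_mu[T])."""
    n = len(P)
    theta = list(mu)        # theta(0) = mu
    filled = [0.0] * n      # Sigma(-1) = 0
    mean = 0.0              # E[T] = sum over t >= 0 of P(T > t)
    for _ in range(t_max):
        # sigma_x(t): stop all arriving mass unless that would exceed the quota pi(x)
        sigma = [max(0.0, min(theta[x], pi[x] - filled[x])) for x in range(n)]
        filled = [filled[x] + sigma[x] for x in range(n)]
        alive = [theta[x] - sigma[x] for x in range(n)]   # P(X_t = x, T > t)
        surviving = sum(alive)
        if surviving < tol:
            break
        mean += surviving
        # theta(t + 1) = (theta(t) - sigma(t)) P, as in (3.3)
        theta = [sum(alive[x] * P[x][y] for x in range(n)) for y in range(n)]
    return filled, mean


# Hypothetical example: a lazy birth-and-death chain on {0, 1, 2}, started at state 0.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
pi = [0.25, 0.5, 0.25]
filled, mean = filling_rule(P, pi, [1.0, 0.0, 0.0])
print(filled, mean)
```

In this example the quotas of states 0 and 1 fill within the first three steps, after which the rule simply waits for the chain to reach state 2; the accumulated quotas converge to $\pi$ and the computed mean is close to 3.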



Definition 3.1. Let $S$ be a stopping time. A state $z$ is called a halting state for the stopping
time if $S \le T_z$ a.s., where $T_z$ is the first hitting time of state $z$.

We will now show that the filling rule has a halting state and then the following theorem gives the
mean-optimality.

Theorem 3.2 (Lovász and Winkler). Let $\mu$ and $\rho$ be two distributions. Let $S$ be a stopping time
such that $\mathbb{P}_\mu(X_S = x) = \rho(x)$ for all $x$. Then $S$ is mean optimal in the sense that

\[ \mathbb{E}_\mu[S] = \min\{\mathbb{E}_\mu[U] : U \text{ is a stopping time s.t. } \mathbb{P}_\mu(X_U \in \cdot) = \rho(\cdot)\} \]

if and only if it has a halting state.

Now we will prove that there exists $z$ such that $T \le T_z$ a.s. For each $x$ we define

\[ t_x = \min\{t : \Sigma_x(t) = \pi(x)\} \le \infty. \]

Take $z$ such that $t_z = \max_x t_x \le \infty$. We will show that $T \le T_z$ a.s. If there exists a $t$ such that
$\mathbb{P}_\mu(T > t,\ T_z = t) > 0$, then $\Sigma_x(t) = \pi(x)$, for all $x$, since the state $z$ is the last one to be filled. So
if the above probability is positive, then we get that

\[ \mathbb{P}_\mu(T \le t) = \sum_x \Sigma_x(t) = 1, \]

which is a contradiction. Hence, we obtain that $\mathbb{P}_\mu(T > t,\ T_z = t) = 0$ and thus by summing over
all $t$ we deduce that $\mathbb{P}_\mu(T \le T_z) = 1$.



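Proof of Theorem 3.2. We define the exit frequencies for $S$ via $\nu_x = \mathbb{E}_\mu\left[ \sum_{k=0}^{S-1} \mathbf{1}(X_k = x) \right]$, for
all $x$.

Since $\mathbb{P}_\mu(X_S = \cdot) = \rho(\cdot)$, we can write

\[ \mathbb{E}_\mu\left[ \sum_{k=0}^{S} \mathbf{1}(X_k = x) \right] = \mathbb{E}_\mu\left[ \sum_{k=0}^{S-1} \mathbf{1}(X_k = x) \right] + \rho(x) = \nu_x + \rho(x). \]

We also have that

\[ \mathbb{E}_\mu\left[ \sum_{k=0}^{S} \mathbf{1}(X_k = x) \right] = \mu(x) + \mathbb{E}_\mu\left[ \sum_{k=1}^{S} \mathbf{1}(X_k = x) \right]. \]

Since $S$ is a stopping time, it is easy to see that

\[ \mathbb{E}_\mu\left[ \sum_{k=1}^{S} \mathbf{1}(X_k = x) \right] = \sum_y \nu_y P(y, x). \]

Hence we get that

\[ \nu_x + \rho(x) = \mu(x) + \sum_y \nu_y P(y, x). \tag{3.5} \]

Let $T$ be another stopping time with $\mathbb{P}_\mu(X_T = \cdot) = \rho(\cdot)$ and let $\nu'_x$ be its exit frequencies. Then
they would satisfy (3.5), i.e.

\[ \nu'_x + \rho(x) = \mu(x) + \sum_y \nu'_y P(y, x). \]

Thus if we set $d = \nu' - \nu$, then $d$ as a vector satisfies

\[ d = dP, \]

and hence $d$ must be a multiple of the stationary distribution, i.e. for a constant $\alpha$ we have that
$d = \alpha\pi$.

Suppose first that $S$ has a halting state, i.e. there exists a state $z$ such that $\nu_z = 0$. Therefore we
get that $\nu'_z = \alpha\pi(z)$, and hence $\alpha \ge 0$. Thus $\nu'_x \ge \nu_x$ for all $x$ and

\[ \mathbb{E}_\mu[T] = \sum_x \nu'_x \ge \sum_x \nu_x = \mathbb{E}_\mu[S], \]

which proves mean-optimality.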

We will now show the converse, namely that if $S$ is mean-optimal then it should have a halting
state. The filling rule was proved to have a halting state and thus is mean-optimal. Hence using
the same argument as above we get that $S$ is mean optimal if and only if $\min_x \nu_x = 0$, which is the
definition of a halting state. □

Before giving the proof of Lemma 2.1 we state and prove a preliminary result.

Lemma 3.3. Let $X$ be a reversible Markov chain on the state space $\Gamma$ and let $L$, $U$ be positive
constants. Let $T$ be a stopping time that achieves stationarity starting from $x$, i.e. $\mathbb{P}_x(X_T = y) =
\pi(y)$, for all $y$. For all $y$ and all times $u$ we define $f_y(u) = \frac12 \mathbb{P}_x(X_u = y,\ T \le L) + \frac12 \mathbb{P}_x(X_{u+1} =
y,\ T \le L)$. Then there exists $u \le L + U$ such that

\[ \sum_y \frac{f_y(u)^2}{\pi(y)} \le 1 + \frac{L}{U}. \]

Proof. In this proof we will write $P_{x,y}(t) = P^t(x, y)$ for notational convenience. We define a
measure $\nu$ on $\Gamma \times [0, L]$ by

\[ \nu(\cdot, \cdot) = \mathbb{P}_x(T \le L,\ (X_T, T) \in (\cdot, \cdot)). \]

We define $g_y(u) = \frac12 \mathbb{P}_x(X_{L+u} = y,\ T \le L) + \frac12 \mathbb{P}_x(X_{L+u+1} = y,\ T \le L)$ for $0 \le u \le U - 1$. By
conditioning on $(X_T, T)$ we get

\[ g_y(u) = \frac12 \sum_{(z,s)} \left( P_{z,y}(L + u - s) + P_{z,y}(L + u + 1 - s) \right) \nu(z, s), \]

where the sum is over $(z, s)$ in $\Gamma \times [0, L]$. Thus

\[ 4 \sum_y \pi(y)^{-1} g_y(u)^2 = I_1 + I_2 + I_3 + I_4, \tag{3.6} \]

where

\[
\begin{aligned}
I_1 &= \sum_{(z_1,s_1),(z_2,s_2)} \sum_y \pi(y)^{-1} P_{z_1,y}(L + u - s_1)\, P_{z_2,y}(L + u - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_2 &= \sum_{(z_1,s_1),(z_2,s_2)} \sum_y \pi(y)^{-1} P_{z_1,y}(L + u - s_1)\, P_{z_2,y}(L + u + 1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_3 &= \sum_{(z_1,s_1),(z_2,s_2)} \sum_y \pi(y)^{-1} P_{z_1,y}(L + u + 1 - s_1)\, P_{z_2,y}(L + u - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_4 &= \sum_{(z_1,s_1),(z_2,s_2)} \sum_y \pi(y)^{-1} P_{z_1,y}(L + u + 1 - s_1)\, P_{z_2,y}(L + u + 1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2).
\end{aligned}
\]

By reversibility we have that

\[
\begin{aligned}
I_1 &= \sum_{(z_1,s_1),(z_2,s_2)} \pi(z_2)^{-1} P_{z_1,z_2}(2L + 2u - s_1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_2 &= \sum_{(z_1,s_1),(z_2,s_2)} \pi(z_2)^{-1} P_{z_1,z_2}(2L + 2u + 1 - s_1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_3 &= \sum_{(z_1,s_1),(z_2,s_2)} \pi(z_2)^{-1} P_{z_1,z_2}(2L + 2u + 1 - s_1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2), \\
I_4 &= \sum_{(z_1,s_1),(z_2,s_2)} \pi(z_2)^{-1} P_{z_1,z_2}(2L + 2u + 2 - s_1 - s_2)\, \nu(z_1,s_1)\, \nu(z_2,s_2).
\end{aligned}
\]

By considering two cases depending on whether $s_1 + s_2$ is odd or even it is elementary to check
that

\[ \frac{1}{U} \sum_{u=0}^{U-1} \left( P_{z_1,z_2}(2L + 2u - s_1 - s_2) + P_{z_1,z_2}(2L + 2u + 1 - s_1 - s_2) \right) \le \frac{1}{U} \sum_{u=0}^{2L+2U-1} P_{z_1,z_2}(u), \]

since $s_1, s_2 \in [0, L]$. Similarly

\[ \frac{1}{U} \sum_{u=0}^{U-1} \left( P_{z_1,z_2}(2L + 2u + 2 - s_1 - s_2) + P_{z_1,z_2}(2L + 2u + 1 - s_1 - s_2) \right) \le \frac{1}{U} \sum_{u=1}^{2L+2U} P_{z_1,z_2}(u). \]

In this last average we have no dependence on $s_1, s_2$. Hence using (3.6), the fact that $\nu(z, [0, L]) \le
\pi(z)$ for all $z$ and stationarity of $\pi$, we get that

\[ \frac{1}{U} \sum_{u=0}^{U-1} \sum_y \pi(y)^{-1} g_y(u)^2 \le \frac{1}{4U} \sum_{z_1, z_2} \pi(z_1) \left( \sum_{u=0}^{2L+2U-1} P_{z_1,z_2}(u) + \sum_{u=1}^{2L+2U} P_{z_1,z_2}(u) \right) = 1 + L/U. \]

This is an upper bound for the average, hence there exists some $u \le U - 1$ such that

\[ \sum_y \pi(y)^{-1} g_y(u)^2 \le 1 + L/U. \]

Since $g_y(u) = f_y(L + u)$ and $L + u \le L + U$, this proves the lemma. □

Remark 3.4. We note that the above lemma uses the same approach as in Aldous [2, Lemma 38].
Aldous' proof is carried out in continuous time. The proof of Lemma 3.3 cannot be done in discrete
time for the non-lazy version of the chain, since in this case defining $f_y(u) = \mathbb{P}_x(X_u = y,\ T \le L)$,
we would get that $\sum_y \frac{f_y(u)^2}{\pi(y)} \le 2 + \frac{2L}{U}$. This is where the averaging plays a crucial role.

We now have all the ingredients needed to give the proof of Lemma 2.1.



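Proof of Lemma 2.1. We fix $x$. Let $T$ be the filling rule as defined at the beginning of this
section, which was shown to achieve the minimum appearing in the definition of $t_{\mathrm{stop}}$. Thus, since
in the definition of $t_{\mathrm{stop}}$ there is a maximum over the starting points, we have that

\[ \mathbb{E}_x[T] \le t_{\mathrm{stop}}. \tag{3.7} \]

Let $f_y(u) = \frac12 \mathbb{P}_x(X_u = y,\ T \le L) + \frac12 \mathbb{P}_x(X_{u+1} = y,\ T \le L)$ as appears in Lemma 3.3, where $L$
and $U$ are two positive constants whose precise value will be determined later in the proof and
$u \le L + U$ is such that

\[ \sum_y \frac{f_y(u)^2}{\pi(y)} \le 1 + \frac{L}{U}. \tag{3.8} \]

We then have

\[
\begin{aligned}
\left\| \tfrac12 P^u(x, \cdot) + \tfrac12 P^{u+1}(x, \cdot) - \pi \right\|
&= \frac12 \sum_y \left| \tfrac12 P^u(x, y) + \tfrac12 P^{u+1}(x, y) - \pi(y) \right| \\
&\le \frac12 \sum_y \left| \tfrac12 P^u(x, y) + \tfrac12 P^{u+1}(x, y) - f_y(u) \right| + \frac12 \sum_y \left| f_y(u) - \pi(y) \right| \\
&= \frac12 \mathbb{P}_x(T > L) + \frac12 \sum_y \pi(y)^{1/2} \left| \frac{f_y(u) - \pi(y)}{\pi(y)^{1/2}} \right|,
\end{aligned}
\]

since $f_y(u) \le \frac12 P^u(x, y) + \frac12 P^{u+1}(x, y)$ and $\sum_y f_y(u) = \mathbb{P}_x(T \le L)$.

By the Cauchy-Schwarz inequality we deduce that

\[ \left( \sum_y \pi(y)^{1/2} \left| \frac{f_y(u) - \pi(y)}{\pi(y)^{1/2}} \right| \right)^2 \le \sum_y \frac{(f_y(u) - \pi(y))^2}{\pi(y)} = \sum_y \pi(y)^{-1} f_y(u)^2 - 2 \sum_y f_y(u) + 1 = \sum_y \pi(y)^{-1} f_y(u)^2 - 2\mathbb{P}_x(T \le L) + 1. \]

Using (3.8) we get that this last expression is bounded from above by

\[ 2\mathbb{P}_x(T > L) + \frac{L}{U}. \]

Since $\left\| \frac12 P^t(x, \cdot) + \frac12 P^{t+1}(x, \cdot) - \pi \right\|$ is decreasing in $t$, we conclude that

\[ \left\| \tfrac12 P^{L+U}(x, \cdot) + \tfrac12 P^{L+U+1}(x, \cdot) - \pi \right\| \le \frac12 \mathbb{P}_x(T > L) + \frac12 \left( 2\mathbb{P}_x(T > L) + \frac{L}{U} \right)^{1/2}. \]

If we now take $L = 20 t_{\mathrm{stop}}$ and $U = 10L$, then by Markov's inequality and (3.7) we get that the
total variation distance satisfies

\[ \left\| \tfrac12 P^{L+U}(x, \cdot) + \tfrac12 P^{L+U+1}(x, \cdot) - \pi \right\| \le \frac14. \]

Thus we get that $t_{\mathrm{ave}} \le L + U = 220 t_{\mathrm{stop}}$ and this concludes the proof of the lemma. □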



4 Proofs of equivalences 

In Section 2 we defined the notion of $t_{\mathrm{stop}}$. In order to prove Lemma 2.2 we will first show a
preliminary result that compares $t_{\mathrm{stop}}$ to $t^L_{\mathrm{stop}}$, where the latter is defined as

\[ t^L_{\mathrm{stop}} = \max_x \min\{\mathbb{E}_x[U_x] : U_x \text{ is a stopping time s.t. } \mathbb{P}_x(X^L_{U_x} \in \cdot) = \pi(\cdot)\}, \]

where $X^L$ stands for the lazy version of the chain $X$.

Lemma 4.1. For every chain we have that

\[ t_{\mathrm{stop}} \le \frac12 t^L_{\mathrm{stop}}. \]

Proof. Let $X^L$ denote the lazy version of the chain $X$. Then $X^L$ can be realized by viewing $X$
at a $\mathrm{Bin}(t, 1/2)$ time, namely let $f(t) \sim \mathrm{Bin}(t, 1/2)$, then $X^L_t = X_{f(t)}$ a.s. We can express $f(t)$
as $f(t) = \sum_{j=1}^{t} \xi(j)$, where $(\xi(j))_{j \ge 1}$ are i.i.d. fair coin tosses. Let $T$ be a stopping time for the
lazy chain $X^L$. We enlarge the filtration by adding all the coin tosses. In particular for each $k$ we
consider the following filtration:

\[ \mathcal{F}_k = \sigma(X_0, \ldots, X_k, (\xi(j))_{j \ge 1}). \]

It is obvious that $X$ has the Markov property with respect to the filtration $\mathcal{F}$ too. Also $f(T)$ is a
stopping time for that filtration. Indeed,

\[ \{f(T) = t\} = \bigcup_{\ell \ge t} \left\{ T = \ell,\ \sum_{j=1}^{\ell} \xi(j) = t \right\} \]

and for each $\ell \ge t$ we have that

\[ \left\{ T = \ell,\ \sum_{j=1}^{\ell} \xi(j) = t \right\} \in \mathcal{F}_t, \]

since on the event $f(\ell) = t$ we have that $X^L_\ell = X_{f(\ell)} = X_t$. Hence $f(T)$ is a stopping time for $X$
and it achieves stationarity, since for all $x$ and $y$

\[ \mathbb{P}_x(X_{f(T)} = y) = \mathbb{P}_x(X^L_T = y) = \pi(y), \]

since $T$ achieves stationarity for the lazy chain. By Wald's identity for stopping times we get that
for all $x$

\[ \mathbb{E}_x[f(T)] = \mathbb{E}_x\left[ \sum_{j=1}^{T} \xi(j) \right] = \mathbb{E}_x[T]\, \mathbb{E}[\xi] = \frac12 \mathbb{E}_x[T]. \]

Hence using a stopping time of the lazy chain $X^L$ achieving stationarity we defined a stopping time
for the base chain $X$ achieving stationarity and with expectation equal to half of the original one.
Thus for all $x$ we obtain that

\[ \{\mathbb{E}_x[T] : T \text{ stopping time s.t. } \mathbb{P}_x(X^L_T = \cdot) = \pi\} \subseteq \{2\mathbb{E}_x[T'] : T' \text{ stopping time s.t. } \mathbb{P}_x(X_{T'} = \cdot) = \pi\}. \]

Therefore taking the minimum concludes the proof.



□ 
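The realization of the lazy chain as the base chain viewed at a $\mathrm{Bin}(t, 1/2)$ time, used in the proof above, amounts to the matrix identity $\left( \frac{P + I}{2} \right)^t = \sum_{k=0}^{t} \binom{t}{k} 2^{-t} P^k$. The sketch below checks it exactly for a small example; it is illustrative only, and the helper names and the deterministic 3-cycle are assumptions chosen for the check.

```python
from fractions import Fraction
from math import comb


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matpow(P, t):
    """P^t by repeated multiplication."""
    R = identity(len(P))
    for _ in range(t):
        R = matmul(R, P)
    return R


def lazy(P):
    """Transition matrix (P + I) / 2 of the lazy chain."""
    n = len(P)
    I = identity(n)
    return [[(P[i][j] + I[i][j]) / 2 for j in range(n)] for i in range(n)]


def binomial_mixture(P, t):
    """sum_k P(Bin(t, 1/2) = k) P^k, the law of X at an independent Bin(t, 1/2) time."""
    n = len(P)
    out = [[Fraction(0)] * n for _ in range(n)]
    Pk = identity(n)
    for k in range(t + 1):
        w = Fraction(comb(t, k), 2 ** t)
        out = [[out[i][j] + w * Pk[i][j] for j in range(n)] for i in range(n)]
        Pk = matmul(Pk, P)
    return out


# Hypothetical example: deterministic rotation on a 3-cycle (a periodic chain).
F = Fraction
P = [[F(0), F(1), F(0)], [F(0), F(0), F(1)], [F(1), F(0), F(0)]]
print(matpow(lazy(P), 5) == binomial_mixture(P, 5))
```

The identity is just the binomial theorem for the commuting matrices $P$ and $I$, which is why exact rational arithmetic gives equality rather than approximate agreement.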



Before giving the proof of Lemma 2.2 we introduce some notation and a preliminary result that
will also be used in the proof of Lemma 2.9. For any $t$ we let

\[ s(t) = \max_{x,y} \left[ 1 - \frac{P^t(x, y)}{\pi(y)} \right] \quad \text{and} \quad \bar{d}(t) = \max_{x,y} \|P^t(x, \cdot) - P^t(y, \cdot)\|. \]

We will call $s$ the total separation distance from stationarity.
We finally define the separation mixing as follows

\[ t_{\mathrm{sep}} = \min\{t \ge 0 : s(t) \le 3/4\}. \]

Lemma 4.2. For a reversible Markov chain we have that

\[ d(t) \le \bar{d}(t) \le 2d(t) \quad \text{and} \quad s(2t) \le 1 - (1 - \bar{d}(t))^2. \]

Proof. A proof of this result can be found in [1, Chapter 4, Lemma 7] or [5, Lemma 4.11 and
Lemma 19.3]. □
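The three distances in Lemma 4.2 are easy to compute for a small reversible chain, which gives a concrete sanity check of the inequalities. The following sketch is illustrative only: the function name `distances` and the 3-state birth-and-death chain are assumptions, and a numerical check on one chain is of course no substitute for the cited proofs.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def tv(mu, nu):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def distances(P, pi, t):
    """Return (d(t), dbar(t), s(t)) computed from the t-step transition matrix."""
    n = len(P)
    Pt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        Pt = matmul(Pt, P)
    d = max(tv(row, pi) for row in Pt)
    dbar = max(tv(Pt[x], Pt[y]) for x in range(n) for y in range(n))
    s = max(1.0 - Pt[x][y] / pi[y] for x in range(n) for y in range(n))
    return d, dbar, s


# Hypothetical reversible example: a lazy birth-and-death chain on {0, 1, 2}.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
pi = [0.25, 0.5, 0.25]
for t in range(1, 6):
    d, dbar, _ = distances(P, pi, t)
    s2t = distances(P, pi, 2 * t)[2]
    print(t, d, dbar, s2t, 1 - (1 - dbar) ** 2)
```

For this chain the separation distance at time 1 equals 1, because state 2 is unreachable from state 0 in one step, while the total variation distance at time 1 is already 1/4; this shows why the separation bound in the lemma is stated at time $2t$.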



Remark 4.3. Lemma 4.2 above gives that $t_{\mathrm{sep}} \le 2 t_{\mathrm{mix}}$.

Lemma 4.4. There exists a positive constant $c$ so that for all chains we have

\[ t_{\mathrm{stop}} \le c\, t_{\mathrm{sep}}. \]

Proof. Fix $t = t_{\mathrm{sep}}$. Then we have that for all $x, y$

\[ P^t(x, y) \ge (1 - 3/4)\pi(y) = \tfrac14 \pi(y). \]

Hence, we can write

\[ P^t(x, y) = \tfrac14 \pi(y) + \tfrac34 \nu_x(y), \]

where for a fixed $x$ we have that $\nu_x$ is a probability measure. We can now construct a stopping
time $S \in \{t, 2t, \ldots\}$ so that for all $x$

\[ \mathbb{P}_x(X_S \in \cdot,\ S = t) = \tfrac14 \pi(\cdot) \]

and by induction on $m$ such that

\[ \mathbb{P}_x(X_S \in \cdot,\ S = mt) = \left( \tfrac34 \right)^{m-1} \tfrac14 \pi(\cdot). \]

Therefore it is clear that $X_S$ is distributed according to $\pi$ and $\mathbb{E}_x[S] = 4t$. Hence we get that
$t_{\mathrm{stop}} \le 4 t_{\mathrm{sep}}$. □

Proof of Lemma 2.2. Let $t^L_{\mathrm{sep}}$ stand for the separation mixing of the lazy chain. Then Lemma 4.4
gives that

\[ t^L_{\mathrm{stop}} \le c\, t^L_{\mathrm{sep}}. \]

Finally, Lemma 4.1 and Remark 4.3 conclude the proof. □

Proof of Lemma 2.3. Fix $t$. Let $T$ be a random variable taking values $t$ and $t + 1$ each with
probability 1/2, i.e.

\[ T = \begin{cases} t, & \text{w.p. } \tfrac12, \\ t + 1, & \text{w.p. } \tfrac12. \end{cases} \]

Thus $T$ can be written as $T = Y_1 + t$, where $Y_1$ is Bernoulli with probability $\frac12$. Then we have that
for all $x$ and $y$

\[ \mathbb{P}_x(X_T = y) = \tfrac12 \mathbb{P}_x(X_t = y) + \tfrac12 \mathbb{P}_x(X_{t+1} = y). \]

Let $Z \sim \mathrm{Bin}(3t, \frac12)$. Then we can write $Z$ as $Z = Y_1 + Z_1$, where $Z_1$ is distributed according to
$\mathrm{Bin}(3t - 1, \frac12)$ and is independent of $Y_1$. Therefore $Z$ can be expressed as the sum of two independent
random variables, $Z = T + (Z_1 - t)$. (With high probability $Z_1 - t = (Z_1 - t)^+$.) We fix $x$. By the
triangle inequality for the total variation distance, we obtain

\[ \|\mathbb{P}_x(X_Z = \cdot) - \pi\| \le \|\mathbb{P}_x(X_{T + (Z_1 - t)^+} = \cdot) - \pi\| + \|\mathbb{P}_x(X_{T + (Z_1 - t)} = \cdot) - \mathbb{P}_x(X_{T + (Z_1 - t)^+} = \cdot)\|. \]

Since $T$ and $(Z_1 - t)^+$ are independent and $(Z_1 - t)^+ \ge 0$, by the monotonicity of the total variation
distance (Claim 2.1) we deduce that

\[ \|\mathbb{P}_x(X_{T + (Z_1 - t)^+} = \cdot) - \pi\| \le \|\mathbb{P}_x(X_T = \cdot) - \pi\|. \tag{4.1} \]

It is easy to see that

\[ \|\mathbb{P}_x(X_{T + (Z_1 - t)} = \cdot) - \mathbb{P}_x(X_{T + (Z_1 - t)^+} = \cdot)\| \le \mathbb{P}(Z_1 < t) \le e^{-ct}, \tag{4.2} \]

for a positive constant $c$, since $Z_1$ follows the Binomial distribution. Hence by (4.1) and (4.2) we
get that

\[ \|\mathbb{P}_x(X_Z = \cdot) - \pi\| \le \|\mathbb{P}_x(X_T = \cdot) - \pi\| + e^{-ct}. \tag{4.3} \]

The mixing time for the lazy chain was defined in (1.1). Equivalently it is given by

\[ t_L(\varepsilon) = \min\left\{ t : \max_x \|\mathbb{P}_x(X_{Z'} = \cdot) - \pi\| \le \varepsilon \right\}, \]

where $Z'$ is distributed according to $\mathrm{Bin}(t, 1/2)$. Thus

\[ t_L(\varepsilon) \le 3 \min\left\{ t : \max_x \|\mathbb{P}_x(X_Z = \cdot) - \pi\| \le \varepsilon \right\}. \]



Finally, from (4.3) we get that there exists a constant $c_2 > 0$ such that

\[ t_L \le c_2\, t_{\mathrm{ave}}(\varepsilon). \]

But > £3^(^)5 since e < | and this concludes the proof. □ 

5 Mixing at a geometric time 

Before giving the proof of Lemma 2.8 we state two easy facts about total variation distance.

Claim 5.1. Let $Y$ be a discrete random variable with values in $\mathbb{N}$ and satisfying

\[ \mathbb{P}(Y = j) \le c \ \text{ for all } j > 0 \quad \text{and} \quad \mathbb{P}(Y = j) \text{ is decreasing in } j, \]

where $c$ is a positive constant. Let $Z$ be an independent random variable with values in $\mathbb{N}$. Then

\[ \|\mathbb{P}(Y + Z = \cdot) - \mathbb{P}(Y = \cdot)\| \le c\, \mathbb{E}[Z]. \tag{5.1} \]

Proof. Using the definition of total variation distance and the assumption on $Y$ we have for all
$k \in \mathbb{N}$

\[ \|\mathbb{P}(Y + k = \cdot) - \mathbb{P}(Y = \cdot)\| = \sum_{j : \mathbb{P}(Y = j) \ge \mathbb{P}(Y + k = j)} \left( \mathbb{P}(Y = j) - \mathbb{P}(Y + k = j) \right) \le kc. \]

Finally, since $Z$ is independent of $Y$, we obtain (5.1). □
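For the geometric variable $Z_t$, which is the case needed below, the shift bound in this proof can be evaluated exactly. The sketch that follows is illustrative and not from the paper: the names `geometric_pmf` and `tv_shift` and the cutoff `j_max` are assumptions. It computes $\|\mathbb{P}(Z_t + k = \cdot) - \mathbb{P}(Z_t = \cdot)\|$ as the sum of the positive parts of the differences of the two mass functions and compares it with the bound $kc$ for $c = 1/t$.

```python
from fractions import Fraction


def geometric_pmf(t, j):
    """P(Z_t = j) for a Geometric variable on {1, 2, ...} with success probability 1/t."""
    if j < 1:
        return Fraction(0)
    return Fraction(1, t) * (1 - Fraction(1, t)) ** (j - 1)


def tv_shift(t, k, j_max=200):
    """||P(Z_t + k = .) - P(Z_t = .)|| as the sum of positive parts of pmf differences."""
    total = Fraction(0)
    for j in range(1, j_max + 1):
        # P(Z_t + k = j) = P(Z_t = j - k), which vanishes for j <= k
        diff = geometric_pmf(t, j) - geometric_pmf(t, j - k)
        if diff > 0:
            total += diff
    return total


t, k = 5, 3
print(tv_shift(t, k), Fraction(k, t))
```

Only the indices $j \le k$ contribute a positive difference, so the distance equals $\mathbb{P}(Z_t \le k) = 1 - (1 - 1/t)^k$, which is at most $k/t$ as the claim predicts.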
The coupling definition of total variation distance gives the following:

Claim 5.2. Let $X$ be a Markov chain and $W$ and $V$ be two random variables with values in $\mathbb{N}$.
Then

\[ \|\mathbb{P}(X_W = \cdot) - \mathbb{P}(X_V = \cdot)\| \le \|\mathbb{P}(W = \cdot) - \mathbb{P}(V = \cdot)\|. \]

Proof of Lemma 2.8. We fix $x$. Let $\tau$ be a stationary time, i.e. $\mathbb{P}_x(X_\tau = \cdot) = \pi$. Then $\tau + s$ is
also a stationary time for all $s \ge 1$. Hence, if $Z_t$ is a Geometric random variable independent of $\tau$,
then $Z_t + \tau$ is also a stationary time, i.e. $\mathbb{P}_x(X_{Z_t + \tau} = \cdot) = \pi$. Since $Z_t$ and $\tau$ are independent, and
$Z_t$ satisfies the assumptions of Claim 5.1 with $c = 1/t$, we get

\[ \|\mathbb{P}_x(Z_t + \tau = \cdot) - \mathbb{P}_x(Z_t = \cdot)\| \le \frac{\mathbb{E}_x[\tau]}{t}. \tag{5.2} \]

From Claim 5.2 we obtain

\[ \|\mathbb{P}_x(X_{Z_t + \tau} = \cdot) - \mathbb{P}_x(X_{Z_t} = \cdot)\| \le \|\mathbb{P}_x(Z_t + \tau = \cdot) - \mathbb{P}_x(Z_t = \cdot)\| \le \frac{\mathbb{E}_x[\tau]}{t}, \]

and since $\mathbb{P}_x(X_{Z_t + \tau} = \cdot) = \pi$, taking $t \ge 4\mathbb{E}_x[\tau]$ concludes the proof. □

Recall from Section 2 the definition of $N_t$ as a random variable independent of the Markov chain
and of mean $t$. We also defined

\[ d_N(t) = \max_x \|\mathbb{P}_x(X_{N_t} = \cdot) - \pi\|. \]

Let $N_t, N'_t$ be i.i.d. random variables distributed as $N_t$ and set $V_t = N_t + N'_t$. We now define

\[ s_N(t) = \max_{x,y} \left[ 1 - \frac{\mathbb{P}_x(X_{V_t} = y)}{\pi(y)} \right] \quad \text{and} \quad \bar{d}_N(t) = \max_{x,y} \|\mathbb{P}_x(X_{N_t} = \cdot) - \mathbb{P}_y(X_{N_t} = \cdot)\|. \]

When $N$ is a geometric random variable we will write $d_G(t)$ and $\bar{d}_G(t)$ respectively.
Lemma 5.1. For all $t$ we have that

\[ d_N(t) \le \bar{d}_N(t) \le 2 d_N(t) \quad \text{and} \quad s_N(t) \le 1 - (1 - \bar{d}_N(t))^2. \]

Proof. Fix $t$ and consider the chain $Y$ with transition matrix $Q(x, y) = \mathbb{P}_x(X_{N_t} = y)$. Then
$Q^2(x, y) = \mathbb{P}_x(X_{V_t} = y)$, where $V_t$ is as defined above. Thus, if we let

\[ s_Y(u) = \max_{x,y} \left[ 1 - \frac{Q^u(x, y)}{\pi(y)} \right] \quad \text{and} \quad \bar{d}_Y(u) = \max_{x,y} \|\mathbb{P}_x(Y_u = \cdot) - \mathbb{P}_y(Y_u = \cdot)\|, \]

then we get that $s_N(t) = s_Y(2)$ and $\bar{d}_N(t) = \bar{d}_Y(1)$. Hence, the lemma follows from Lemma 4.2. □

We now define

\[ t_{s,N} = \min\{t \ge 0 : s_N(t) \le \tfrac34\}. \]

Lemma 5.2. There exists a positive constant $c$ so that for every chain

\[ t_{\mathrm{stop}} \le c\, t_{s,N}. \]

Proof. Fix $t = t_{s,N}$. Consider the chain $Y$ with transition kernel $Q(x, y) = \mathbb{P}_x(X_{V_t} = y)$, where $V_t$
is as defined above.

By the definition of $s_N(t)$ we have that for all $x$ and $y$

\[ Q(x, y) \ge (1 - s_N(t))\pi(y) \ge \tfrac14 \pi(y). \]

Thus, in the same way as in the proof of Lemma 4.4 we can construct a stopping time $S$ such that
$Y_S$ is distributed according to $\pi$ and $\mathbb{E}_x[S] = 4$ for all $x$.

Let $V_t^{(1)}, V_t^{(2)}, \ldots$ be i.i.d. random variables distributed as $V_t$. Then we can write $Y_u = X_{V_t^{(1)} + \cdots + V_t^{(u)}}$.
If we let $T = V_t^{(1)} + \cdots + V_t^{(S)}$, then $T$ is a stopping time for $X$ such that $\mathcal{L}(X_T) = \pi$ and by Wald's
identity for stopping times we get that for all $x$

\[ \mathbb{E}_x[T] = \mathbb{E}_x[S]\, \mathbb{E}[V_t] = 8t. \]

Therefore we proved that

\[ t_{\mathrm{stop}} \le 8 t_{s,N}. \]

□

Proof of Lemma 2.9. From Lemma 5.1 we get that

\[ t_{s,N} \le 2 t_N. \]

Finally Lemma 5.2 completes the proof. □

Remark 5.3. Let $N_t$ be a uniform random variable in $\{1, \ldots, t\}$ independent of the Markov chain.
The mixing time associated to $N_t$ is called Cesaro mixing and it has been analyzed by Lovász and
Winkler in [7]. From [5, Theorem 6.15] and the lemmas above we get the equivalence between the
Cesaro mixing and the mixing of the lazy chain in the reversible case. In Section 7 we show that
the Cesaro mixing time is equivalent to $t_G$ for all chains.

Remark 5.4. From the remark above we see that the mixing at a geometric time and the Cesaro
mixing are equivalent for a reversible chain. The mixing at a geometric time though has the
advantage that its total variation distance, namely $d_G(t)$, has the monotonicity property (Lemma 2.7),
which is not true for the corresponding total variation distance for the Cesaro mixing.

Recall that $\bar{d}(t) = \max_{x,y} \|\mathbb{P}_x(X_t = \cdot) - \mathbb{P}_y(X_t = \cdot)\|$ is submultiplicative as a function of $t$ (see for
instance [5, Lemma 4.12]). In the following lemma and corollary, which will be used in the proof
of Theorem 1.1, we show that $\bar{d}_G$ satisfies some sort of submultiplicativity.

Lemma 5.5. Let /3 < 1 and let t be such that da(t) < (3. Then for all k £ N we have that 

d G (2 k t)< ^±^ k da(t). 

Proof. As in the proof of Lemma 2.7 we can write $Z_{2t} = (Z_{2t} - Z_t) + Z_t$, where $Z_{2t} - Z_t$ and $Z_t$
are independent. Hence it is easy to show (similar to the case for deterministic times) that

\[ \bar d_G(2t) \le \bar d_G(t) \max_{x,y} \|\mathbb{P}_x(X_{Z_{2t}-Z_t} = \cdot) - \mathbb{P}_y(X_{Z_{2t}-Z_t} = \cdot)\|. \tag{5.3} \]

By the coupling of $Z_{2t}$ and $Z_t$ it is easy to see that $Z_{2t} - Z_t$ can be expressed as follows:

\[ Z_{2t} - Z_t = (1-\xi)\cdot 0 + \xi\, G_{2t}, \]

where $\xi$ is a Bernoulli($\tfrac12$) random variable and $G_{2t}$ is a Geometric random variable of mean $2t$
independent of $\xi$. By the triangle inequality we get that

\[ \|\mathbb{P}_x(X_{Z_{2t}-Z_t} = \cdot) - \mathbb{P}_y(X_{Z_{2t}-Z_t} = \cdot)\| \le \frac12 + \frac12 \|\mathbb{P}_x(X_{G_{2t}} = \cdot) - \mathbb{P}_y(X_{G_{2t}} = \cdot)\| = \frac12 + \frac12 \bar d_G(2t), \]

and hence (5.3) becomes

\[ \bar d_G(2t) \le \bar d_G(t)\left(\frac12 + \frac12 \bar d_G(2t)\right) \le \frac12 \bar d_G(t)\left(1 + \bar d_G(t)\right), \]

where for the second inequality we used the monotonicity property of $\bar d_G$ (same proof as for $d_G(t)$).
Thus, since $t$ satisfies $\bar d_G(t) \le \beta$, we get that

\[ \bar d_G(2t) \le \left(\frac{1+\beta}{2}\right) \bar d_G(t), \]

and hence iterating we deduce the desired inequality. □
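
The decomposition of $Z_{2t} - Z_t$ used in this proof can be checked exactly for small integer $t$. The sketch below assumes that $Z_t$ is geometric on $\{1, 2, \ldots\}$ with success probability $1/t$ (so that it has mean $t$); the helper names `geom_pmf`, `diff_pmf`, `convolve` and `decomposition_holds` are illustrative and do not come from the paper. It verifies, on an initial segment of the integers, that the sum of $Z_t$ and an independent copy of $\xi G_{2t}$ has the law of $Z_{2t}$.

```python
from fractions import Fraction

def geom_pmf(mean, m):
    """P(Z = j) for j = 0..m, where Z is geometric on {1, 2, ...} with integer mean."""
    p = Fraction(1, mean)
    return [Fraction(0)] + [p * (1 - p) ** (j - 1) for j in range(1, m + 1)]

def diff_pmf(t, m):
    """Law of xi * G_{2t} on 0..m, with xi ~ Bernoulli(1/2) independent of G_{2t}."""
    g = geom_pmf(2 * t, m)
    return [Fraction(1, 2)] + [g[j] / 2 for j in range(1, m + 1)]

def convolve(a, b, m):
    """Law of the sum of two independent variables, restricted to 0..m."""
    return [sum(a[i] * b[j - i] for i in range(j + 1)) for j in range(m + 1)]

def decomposition_holds(t, m=30):
    """Check that Z_t + xi * G_{2t} and Z_{2t} agree on {0, ..., m}."""
    lhs = convolve(geom_pmf(t, m), diff_pmf(t, m), m)
    return lhs == geom_pmf(2 * t, m)

print(all(decomposition_holds(t) for t in range(1, 6)))
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.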

Combining Lemma 5.5 with Lemma 5.1 we get the following:

Corollary 5.6. If $t$ is such that $\bar d_G(t) \le \beta$, then for all $k$ we have that

\[ d_G(2^k t) \le 2\left(\frac{1+\beta}{2}\right)^k d_G(t). \]

Also if $d_G(t) \le \alpha < 1/2$, then there exists a constant $c = c(\alpha)$ depending only on $\alpha$, such that
$d_G(ct) \le 1/4$.

6 Hitting large sets 

In this section we are going to give the proof of Theorem 1.1. We first prove an equivalence that
does not require reversibility.

Theorem 6.1. Let $\alpha < 1/2$. For every chain $t_G \asymp t_H(\alpha)$. (The implied constants depend on $\alpha$.)

Proof. We will first show that $t_G \ge c\, t_H(\alpha)$. By Corollary 5.6 there exists $k = k(\alpha)$ so that
$d_G(2^k t_G) \le \alpha/2$. Let $t = 2^k t_G$. Then for any starting point $x$ and any set $A$ with $\pi(A) \ge \alpha$ we have that

\[ \mathbb{P}_x(X_{Z_t} \in A) \ge \pi(A) - \alpha/2 \ge \alpha/2. \]

Thus by performing independent experiments, we deduce that $\tau_A$ is stochastically dominated by
$\sum_{i=1}^{N} G_i$, where $N$ is a Geometric random variable of success probability $\alpha/2$ and the $G_i$'s are
independent Geometric random variables of success probability $1/t$. Therefore for any starting point
$x$ we get that

\[ \mathbb{E}_x[\tau_A] \le \frac{2}{\alpha}\, t, \]

and hence this gives that

\[ \max_{x,\, A:\, \pi(A) \ge \alpha} \mathbb{E}_x[\tau_A] \le \frac{2}{\alpha}\, 2^k t_G. \]

In order to show the other direction, let $t' < t_G$. Then $d_G(t') > 1/4$. For a given $\alpha < 1/2$, we fix
$\gamma \in (\alpha, 1/2)$. From Corollary 5.6 we have that there exists a constant $c = c(\gamma)$ such that

\[ d_G(ct') > \gamma. \]

Set $t = ct'$. Then there exists a set $A$ and a starting point $x$ such that

\[ \pi(A) - \mathbb{P}_x(X_{Z_t} \in A) > \gamma, \]

and hence $\pi(A) > \gamma$, or equivalently

\[ \mathbb{P}_x(X_{Z_t} \in A) < \pi(A) - \gamma. \]

We now define a set $B$ as follows:

\[ B = \{y : \mathbb{P}_y(X_{Z_t} \in A) \ge \pi(A) - \alpha\}, \]

where $c$ is a constant smaller than $\alpha$. Since $\pi$ is a stationary distribution, we have that

\[ \pi(A) = \sum_{y \in B} \mathbb{P}_y(X_{Z_t} \in A)\pi(y) + \sum_{y \notin B} \mathbb{P}_y(X_{Z_t} \in A)\pi(y) \le \pi(B) + \pi(A) - \alpha, \]

and hence rearranging, we get that

\[ \pi(B) \ge \alpha. \]

We will now show that for a constant $\theta$ to be determined later we have that

\[ \max_z \mathbb{E}_z[\tau_B] > \theta t. \tag{6.1} \]

We will show that for a $\theta$ to be specified later, assuming

\[ \max_z \mathbb{E}_z[\tau_B] \le \theta t \tag{6.2} \]

will yield a contradiction.

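By Markov's inequality, (6.2) implies that

\[ \mathbb{P}_x(\tau_B > 2\theta t) \le \frac12. \tag{6.3} \]

For any positive integer $M$ we have that

\[ \mathbb{P}_x(\tau_B > 2M\theta t) = \mathbb{P}_x(\tau_B > 2M\theta t \mid \tau_B > 2(M-1)\theta t)\, \mathbb{P}_x(\tau_B > 2(M-1)\theta t), \]

and hence iterating we get that

\[ \mathbb{P}_x(\tau_B > 2M\theta t) \le \frac{1}{2^M}. \tag{6.4} \]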

By the memoryless property of the Geometric distribution and the strong Markov property applied
at the stopping time $\tau_B$, we get that

\begin{align*}
\mathbb{P}_x(X_{Z_t} \in A) &\ge \mathbb{P}_x(\tau_B \le 2\theta M t,\ Z_t > \tau_B,\ X_{Z_t} \in A) \\
&\ge \mathbb{P}_x(\tau_B \le 2\theta M t,\ Z_t > \tau_B)\, \mathbb{P}_x(X_{Z_t} \in A \mid \tau_B \le 2\theta M t,\ Z_t > \tau_B) \\
&\ge \mathbb{P}_x(\tau_B \le 2\theta M t)\, \mathbb{P}_x(Z_t > 2\theta M t) \left(\inf_{w \in B} \mathbb{P}_w(X_{Z_t} \in A)\right).
\end{align*}

But since $Z_t$ is a Geometric random variable, we obtain that

\[ \mathbb{P}_x(Z_t > 2\theta M t) = \left(1 - \frac{1}{t}\right)^{2\theta M t}, \]

which for $2\theta M t > 1$ gives that

\[ \mathbb{P}_x(Z_t > 2\theta M t) \ge 1 - 2\theta M. \tag{6.5} \]

((6.2) implies that $\theta t \ge 1$, so certainly $2\theta M t > 1$.)

We now set $\theta = \frac{1}{2M 2^M}$. Using (6.4), (6.5) and the definition of $B$ we deduce that

\[ \mathbb{P}_x(X_{Z_t} \in A) \ge (1 - 2^{-M})^2 (\pi(A) - \alpha). \]

Since $\gamma > \alpha$, we can take $M$ large enough so that $(1 - 2^{-M})^2 (\pi(A) - \alpha) > \pi(A) - \gamma$. This
contradicts the choice of $x$ and $A$, so (6.2) cannot hold.



Thus (6.1) holds; since $\pi(B) \ge \alpha$, this completes the proof. □
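
To get a feel for the size of the constants in this argument, the following sketch computes the smallest admissible $M$ and the resulting $\theta = 1/(2M2^M)$ for given $\alpha < \gamma < 1/2$. It relies on the observation (an assumption of this sketch, not a statement from the paper) that $(1-2^{-M})^2(p - \alpha) - (p - \gamma)$ is decreasing in $p = \pi(A)$, so it suffices to test the worst case $\pi(A) = 1$. The function names `smallest_M` and `theta` are illustrative.

```python
from fractions import Fraction

def smallest_M(alpha, gamma):
    """Smallest M >= 1 with (1 - 2^-M)^2 (1 - alpha) > 1 - gamma.

    This is the worst case pi(A) = 1 of the requirement
    (1 - 2^-M)^2 (pi(A) - alpha) > pi(A) - gamma.
    """
    assert 0 < alpha < gamma < Fraction(1, 2)
    M = 1
    while (1 - Fraction(1, 2 ** M)) ** 2 * (1 - alpha) <= 1 - gamma:
        M += 1
    return M

def theta(M):
    """theta = 1 / (2 M 2^M), the choice made in the proof."""
    return Fraction(1, 2 * M * 2 ** M)

alpha, gamma = Fraction(1, 4), Fraction(3, 8)
M = smallest_M(alpha, gamma)
print(M, theta(M))
```

For $\alpha = 1/4$ and $\gamma = 3/8$ this gives $M = 4$ and $\theta = 1/128$, which illustrates that the implied constant in Theorem 6.1 deteriorates quickly as $\gamma - \alpha$ shrinks.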

Proof of Theorem 1.1. Combining Theorem 2.10 with Theorem 6.1 gives the result in the reversible case. □



7 Equivalence between Cesàro mixing and $t_G$

In this section we will show that the notion of mixing at a geometric time defined in Section 2 and
the Cesàro mixing used by Lovász and Winkler [7] are equivalent for all chains. First, let us recall
the definition of Cesàro mixing. Let $U_t$ be a random variable independent of the chain, uniform on
$\{1, \ldots, t\}$. We define

\[ t_{\mathrm{Ces}} = \min\left\{ t \ge 0 : \max_x \|\mathbb{P}_x(X_{U_t} = \cdot) - \pi\| \le \frac14 \right\}. \]

Proposition 7.1. For all chains $t_G \asymp t_{\mathrm{Ces}}$.

Proof. For each $s$, let $U_s$ be a uniform random variable in $\{1, \ldots, s\}$ and $Z_s$ an independent
geometric random variable of mean $s$.

We will first show that there exists a positive constant $c_1$ such that

\[ t_{\mathrm{Ces}} \le c_1 t_G. \tag{7.1} \]

Let $t = t_G(1/8)$, then for all $x$

\[ \|\mathbb{P}_x(X_{Z_t} = \cdot) - \pi\| \le \frac18. \tag{7.2} \]

From Claims [5TTI and [5?2l we get that 

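\[ \|\mathbb{P}_x(X_{U_{8t}} = \cdot) - \mathbb{P}_x(X_{U_{8t}+Z_t} = \cdot)\| \le \|\mathbb{P}(U_{8t} = \cdot) - \mathbb{P}(U_{8t} + Z_t = \cdot)\| \le \frac18. \]

By the triangle inequality for total variation we deduce

\[ \|\mathbb{P}_x(X_{U_{8t}} = \cdot) - \pi\| \le \|\mathbb{P}_x(X_{U_{8t}} = \cdot) - \mathbb{P}_x(X_{U_{8t}+Z_t} = \cdot)\| + \|\mathbb{P}_x(X_{U_{8t}+Z_t} = \cdot) - \pi\|. \]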
From ([72D and Claim O it follows that 

\[ \|\mathbb{P}_x(X_{U_{8t}+Z_t} = \cdot) - \pi\| \le \|\mathbb{P}_x(X_{Z_t} = \cdot) - \pi\| \le \frac18. \]

Hence, we conclude

\[ \|\mathbb{P}_x(X_{U_{8t}} = \cdot) - \pi\| \le \frac14, \]

which gives that $t_{\mathrm{Ces}} \le 8t$. From Corollary 5.6 we get that there exists a constant $c$ such that
$t_G(1/8) \le c\, t_G$ and this concludes the proof of (7.1).
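
The bound of $1/8$ on the total variation distance between $U_{8t}$ and $U_{8t} + Z_t$ can be checked by direct computation for small $t$. The sketch below assumes that $Z_t$ is geometric on $\{1, 2, \ldots\}$ with success probability $1/t$; the function name `tv_uniform_vs_shifted` is illustrative. Since $U_s$ puts no mass above $s$, the total variation distance equals the sum over $j \le s$ of the positive parts of $\mathbb{P}(U_s = j) - \mathbb{P}(U_s + Z_t = j)$, which is a finite sum.

```python
from fractions import Fraction

def tv_uniform_vs_shifted(s, t):
    """Total variation distance between U_s and U_s + Z_t.

    U_s is uniform on {1, ..., s}; Z_t is an independent geometric variable on
    {1, 2, ...} with success probability 1/t (an assumption of this sketch).
    """
    p = Fraction(1, t)

    def geom(j):
        # P(Z_t = j) for j >= 1
        return p * (1 - p) ** (j - 1)

    total = Fraction(0)
    for j in range(1, s + 1):
        # P(U_s + Z_t = j): condition on U_s = i for i = 1, ..., j - 1
        q_j = sum(Fraction(1, s) * geom(j - i) for i in range(1, j))
        total += max(Fraction(1, s) - q_j, Fraction(0))
    return total

for t in (1, 2, 5):
    print(t, float(tv_uniform_vs_shifted(8 * t, t)))
```

In each case tried the distance is at most $1/8$, in line with the heuristic that shifting a uniform variable on $\{1, \ldots, s\}$ by an independent amount of mean $t$ changes its law by at most $t/s$.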

We will now show that there exists a positive constant $c_2$ such that

\[ t_G \le c_2 t_{\mathrm{Ces}}. \]

Let $t = t_{\mathrm{Ces}}$, i.e. for all $x$

\[ \|\mathbb{P}_x(X_{U_t} = \cdot) - \pi\| \le \frac14. \tag{7.3} \]

From Claims [5TT1 and [5T21 we get that 

\[ \|\mathbb{P}_x(X_{Z_{8t}} = \cdot) - \mathbb{P}_x(X_{U_t+Z_{8t}} = \cdot)\| \le \|\mathbb{P}(Z_{8t} = \cdot) - \mathbb{P}(Z_{8t} + U_t = \cdot)\| \le \frac18. \]

So, in the same way as in the proof of (7.1), we obtain

\[ \|\mathbb{P}_x(X_{Z_{8t}} = \cdot) - \pi\| \le \frac38. \]

Hence, we deduce that $t_G(3/8) \le 8t$ and from Corollary 5.6 again there exists a positive constant
$c'$ such that $t_G \le c'\, t_G(3/8)$ and this finishes the proof. □



8 A new proof of $t_{\mathrm{prod}} \asymp t_L$ for reversible chains

Recall the definition $t_{\mathrm{prod}} = \max_{x,A} \pi(A)\,\mathbb{E}_x[\tau_A]$ from Remark 1.2. As noted there, Aldous [2] showed
the equivalence between the mixing time $t_{\mathrm{cts}}$ of a continuous time reversible chain and $t_{\mathrm{prod}}$. Using
the equivalence $t_L \asymp t_{\mathrm{cts}}$ (see [5, Theorem 20.3]) it follows that for a reversible chain $t_{\mathrm{prod}} \asymp t_L$.
In this section we give a direct proof. Recall that $t_{\mathrm{prod}} \ge c\, t_L$ for a reversible chain, where $c$ is a
positive constant, follows from Theorem 1.1.



We will first state and prove a preliminary lemma, which is a variant of Kac's lemma (see for
instance [5, Lemma 21.13]). To that end we define for all $k$ and all sets $A$

\[ \tau_A^+ = \min\{t \ge 1 : X_t \in A\} \quad \text{and} \quad \tau_A^{(k)} = \min\{t \ge k : X_t \in A\}. \]

Lemma 8.1. We have that

\[ \sum_{x \in A} \pi(x)\,\mathbb{E}_x[\tau_A^{(k)}] \le k. \]

x€A 

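Proof. Let $\hat P$ be the transition matrix of the reversed chain, i.e.

\[ \hat P(x,y) = \frac{\pi(y) P(y,x)}{\pi(x)}. \]

Then for all $t \ge k$ and $x_0, \ldots, x_t$ in the state space $S$, we have

\[ \pi(x_0) \prod_{i=1}^{t} P(x_{i-1}, x_i) = \pi(x_t) \prod_{i=1}^{t} \hat P(x_i, x_{i-1}). \]

Summing over all $x_0 = x \in A$, $x_1 \in S, \ldots, x_{k-1} \in S$, $x_k \notin A, \ldots, x_{t-1} \notin A$, $x_t = y \in S$ we obtain

\[ \sum_{x \in A} \pi(x)\,\mathbb{P}_x(\tau_A^{(k)} \ge t) \le \sum_{y} \pi(y)\,\hat{\mathbb{P}}_y(\hat\tau_A^+ \in (t-k, t]), \tag{8.1} \]

where $\hat\tau_A^+$ stands for the first positive entrance time to $A$ for the reversed chain. Summing (8.1)
over all $t$ we get that

\begin{align*}
\sum_{x \in A} \pi(x)\,\mathbb{E}_x[\tau_A^{(k)}] &= \sum_{x \in A} \sum_{t} \pi(x)\,\mathbb{P}_x(\tau_A^{(k)} \ge t) \le \sum_{t} \sum_{y} \pi(y) \sum_{s=t-k+1}^{t} \hat{\mathbb{P}}_y(\hat\tau_A^+ = s) \\
&= \sum_{y} \pi(y) \sum_{s} \sum_{t=s}^{s+k-1} \hat{\mathbb{P}}_y(\hat\tau_A^+ = s) = \sum_{y} \pi(y) \sum_{s} k\,\hat{\mathbb{P}}_y(\hat\tau_A^+ = s) = k.
\end{align*}

□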
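
Lemma 8.1 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the chain is the random walk on a weighted graph given by a symmetric integer conductance matrix `c` (a hypothetical input format), and the helper names `solve` and `kac_sum` are not from the paper. It uses the identity $\mathbb{E}_x[\tau_A^{(k)}] = k + \sum_y P^k(x,y)\,\mathbb{E}_y[\tau_A]$, which follows from the Markov property at time $k$.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                m[r] = [vr - m[r][col] * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

def kac_sum(c, A, k):
    """Return sum over x in A of pi(x) * E_x[tau_A^(k)] for the walk with conductances c."""
    n = len(c)
    deg = [sum(row) for row in c]
    P = [[Fraction(c[x][y], deg[x]) for y in range(n)] for x in range(n)]
    pi = [Fraction(deg[x], sum(deg)) for x in range(n)]
    out = [x for x in range(n) if x not in A]
    # h(y) = E_y[tau_A] for y outside A solves (I - P restricted to A^c) h = 1
    a = [[(1 if y == z else 0) - P[y][z] for z in out] for y in out]
    h = dict(zip(out, solve(a, [Fraction(1)] * len(out))))
    total = Fraction(0)
    for x in A:
        row = [Fraction(int(y == x)) for y in range(n)]
        for _ in range(k):  # row becomes the law of X_k started from x
            row = [sum(row[y] * P[y][z] for y in range(n)) for z in range(n)]
        total += pi[x] * (k + sum(row[y] * h.get(y, 0) for y in range(n)))
    return total

# Path on 4 vertices with conductances 1, 2, 1 and A = {0, 1}
c = [[0, 1, 0, 0], [1, 0, 2, 0], [0, 2, 0, 1], [0, 0, 1, 0]]
print([kac_sum(c, [0, 1], k) for k in range(1, 5)])
```

For $k = 1$ the sum equals $1$, which is Kac's lemma; for larger $k$ the computed values stay below $k$, as the lemma asserts.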



Proof of $t_{\mathrm{prod}} \le c'\, t_L$. To simplify notation, let the chain $X$ be lazy and reversible. From Lemma 8.1
and Markov's inequality it follows that for all $k$ and all sets $A$

\[ \mathbb{P}_{\pi|_A}\left(\tau_A^{(k)} \ge \frac{2k}{\pi(A)}\right) \le \frac12, \tag{8.2} \]

where $\pi|_A$ stands for the restriction of the stationary measure $\pi$ on $A$.

Take now $k = 2t_L$. Then using submultiplicativity we get that $\bar d_L(k) \le (\bar d_L(t_L))^2 \le \frac14$. Let $X_0 \sim \pi|_A$
and $z \in S$. Then

\[ \|P^k(X_0, \cdot) - P^k(z, \cdot)\| \le \frac14. \]

We can couple the two chains, $X_k, X_{k+1}, \ldots$ with $X_0 \sim \pi|_A$ and $Y_k, Y_{k+1}, \ldots$ with $Y_0 = z$, so that
they disagree with probability $\|P^k(X_0, \cdot) - P^k(z, \cdot)\|$.



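Thus we obtain

\[ \left| \mathbb{P}_{\pi|_A}\left(\tau_A^{(k)} \ge \frac{2k}{\pi(A)}\right) - \mathbb{P}_z\left(\tau_A^{(k)} \ge \frac{2k}{\pi(A)}\right) \right| \le \mathbb{P}(\text{coupling fails}) \le \frac14, \]

and hence using (8.2) we get that

\[ \mathbb{P}_z\left(\tau_A^{(k)} \ge \frac{2k}{\pi(A)}\right) \le \frac34. \]

Therefore for all $z$ we have that

\[ \mathbb{P}_z\left(\tau_A \ge \frac{2k}{\pi(A)}\right) \le \mathbb{P}_z\left(\tau_A^{(k)} \ge \frac{2k}{\pi(A)}\right) \le \frac34. \]

By performing independent experiments, each of which succeeds with probability at least $\frac14$, we see
that $\tau_A$ is stochastically dominated by $\frac{2k}{\pi(A)}\,\mathrm{Geo}\!\left(\frac14\right)$, where Geo stands for a Geometric random
variable, and hence for all $z$ we get that

\[ \mathbb{E}_z[\tau_A] \le \frac{8k}{\pi(A)} = \frac{16\, t_L}{\pi(A)}, \]

and this finishes the proof. □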
9 Application to robustness of mixing

Theorem 9.1. Let $T$ be a finite tree on $n$ vertices with unit conductances on the edges. Let $\tilde T$
be a tree on the same set of vertices and edges as $T$ but with conductances on the edges satisfying
$c \le c(x,y) \le c'$, for all edges $e = (x,y)$, where $c$ and $c'$ are two positive constants. Then the mixing
times of the lazy random walk on $T$ and on $\tilde T$ are equivalent, i.e. in our notation, $t_L(T) \asymp t_L(\tilde T)$.

Before proving the theorem, we state and prove two lemmas which will be used in the proof but 
are also of independent interest. 

Lemma 9.2. Let $T$ be a finite tree with edge conductances. For each subset $A$ of vertices and any
vertex $v$ we have

\[ \max_x \mathbb{E}_x[\tau_A] \le t_v \left(1 + \frac{1}{\pi(A)}\right), \]

where $\tau_A$ stands for the first hitting time of $A$ by a simple random walk on $T$ and $t_v = \max_x \mathbb{E}_x[\tau_v]$.

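Proof. If $v \in A$, then the result is clear, so we assume that $v \notin A$.
For all $x$ we have

\[ \mathbb{E}_x[\tau_A] \le \mathbb{E}_x[\tau_v] + \mathbb{E}_v[\tau_A] \le t_v + \mathbb{E}_v[\tau_A]. \]

Thus it suffices to show that

\[ \mathbb{E}_v[\tau_A] \le \frac{t_v}{\pi(A)}. \tag{9.1} \]

In order to show that, we are going to look at excursions of the random walk from $v$. Defining $Z_A$
to be the time that the walk spends in $A$ in an excursion from $v$, i.e., $Z_A = \sum_{t=1}^{\tau_v^+} \mathbf{1}(X_t \in A)$, we
can write

\[ \mathbb{P}_v(Z_A > 0) = \frac{\mathbb{E}_v[Z_A]}{\mathbb{E}_v[Z_A \mid Z_A > 0]}. \]

Clearly

\[ \mathbb{E}_v[Z_A] = \frac{\pi(A)}{\pi(v)} \quad \text{and} \quad \mathbb{E}_v[Z_A \mid Z_A > 0] \le t_v. \]

Hence

\[ \mathbb{P}_v(\tau_A < \tau_v^+) \ge \frac{\pi(A)}{\pi(v)} \cdot \frac{1}{t_v}. \]

Therefore we get

\[ \mathbb{E}_v[\tau_A] \le \mathbb{E}\left[\sum_{i=1}^{N} \ell_i\right], \]

where $N$ is a geometric random variable of success probability $\frac{\pi(A)}{\pi(v)\, t_v}$ and $\ell_i$ is the length of the
$i$-th excursion from $v$. By Wald's identity we have

\[ \mathbb{E}_v[\tau_A] \le \mathbb{E}[N]\,\mathbb{E}[\ell_1] = \frac{\pi(v)\, t_v}{\pi(A)} \cdot \frac{1}{\pi(v)} = \frac{t_v}{\pi(A)}, \]

and this completes the proof. □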
We call a node $v$ in $T$ central if each component of $T - \{v\}$ has stationary probability at most 1/2.
It is easy to see that central nodes exist. Indeed, for any node $u$ of the tree denote by $C(u)$ the
component of $T - \{u\}$ with the largest stationary probability. Now consider the vertex $u^*$ that
achieves $\min_u \pi(C(u))$. This is clearly a central node, since if $\pi(C(u^*)) > 1/2$, then the neighbour
$w \in C(u^*)$ of $u^*$ would satisfy $\pi(C(w)) < \pi(C(u^*))$, contradicting the choice of $u^*$.
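
The definition of a central node translates directly into a small search procedure. The sketch below is an illustration only: the input format (a dictionary mapping edges to integer or `Fraction` conductances) and the function name `central_nodes` are assumptions, and the stationary distribution is taken to be $\pi(x) \propto \sum_y c(x,y)$, as for any random walk on a weighted graph.

```python
from fractions import Fraction

def central_nodes(edges):
    """Return the central nodes of a tree given as {(u, v): conductance}."""
    adj = {}
    for (u, v), c in edges.items():
        adj.setdefault(u, {})[v] = c
        adj.setdefault(v, {})[u] = c
    total = sum(sum(nb.values()) for nb in adj.values())
    pi = {u: Fraction(sum(nb.values()), total) for u, nb in adj.items()}

    def mass(start, removed):
        """Stationary mass of the component of T - {removed} containing start."""
        seen, stack, m = {removed, start}, [start], Fraction(0)
        while stack:
            x = stack.pop()
            m += pi[x]
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return m

    # v is central if every component of T - {v} has mass at most 1/2
    return [u for u in adj
            if all(mass(w, u) <= Fraction(1, 2) for w in adj[u])]

path = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1}
print(central_nodes(path))
```

On the unit-conductance path with four vertices both middle vertices are central, which shows that a central node need not be unique.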

Lemma 9.3. Let $T$ be a tree on $n$ vertices with conductances on the edges. Then for any central
node $v$ of $T$

\[ t_L \asymp t_v, \]

where $t_v = \max_x \mathbb{E}_x[\tau_v]$.

Proof. First of all from Lemma 9.2 and Theorem 1.1 we obtain that for any central node $v$

\[ t_L \le c\, t_v, \tag{9.2} \]

for an absolute constant $c$.

To finish the proof of the lemma we have to show that for any central node $v$

\[ t_L \ge c\, t_v, \tag{9.3} \]

for a positive absolute constant $c$.

It is easy to see that $\mathbb{E}_x[\tau_v] = \mathbb{E}_x[\tau_B]$, for $x \ne v$, where $B$ is the union of $\{v\}$ and the components
of $T - \{v\}$ that do not contain $x$. The definition of a central node gives that $\pi(B) \ge 1/2$. Hence,

\[ t_v \le t_H(1/2). \tag{9.4} \]

Inequality (9.3) now follows from Theorem 1.1. □



We now recall a formula from [1, Lemma 1, Chapter 5] for the expected hitting time on trees.

Lemma 9.4. Let $T$ be a finite tree with edge conductances $c(u,v)$, for all edges $(u,v)$. Let $x$
and $y$ be two vertices of $T$ and let $\{v_0 = x, v_1, \ldots, v_n = y\}$ be the unique path joining them.
Let $T_x(z)$ be the union of $\{z\}$ and the connected component of $T - \{z\}$ containing $x$. Writing
$c_i = \sum_{w,z \in T_x(v_{i+1})} c(w,z)$, we then have

Proof of Theorem 9.1. From Lemma 9.4 and the boundedness of the conductances we get that
for any two vertices $x$ and $v$

\[ \mathbb{E}_x[\tau_v] \asymp \mathbb{E}_x[\tilde\tau_v], \]

where $\tilde\tau$ denotes hitting times for the random walk on $\tilde T$.

Lemma 9.3 then finishes the proof. □
10 Examples and Questions

We start this section with examples that show that the reversibility assumption in Theorem 1.1
and Corollary 2.5 is essential.

Example 10.1. Biased random walk on the cycle. 

Let $\mathbb{Z}_n = \{1, 2, \ldots, n\}$ denote the $n$-cycle and let $P(i, i+1) = \frac23$ for all $1 \le i < n$ and $P(n, 1) = \frac23$.
Also $P(i, i-1) = \frac13$, for all $1 < i \le n$, and $P(1, n) = \frac13$. Then it is easy to see that the mixing time
of the lazy random walk is of order $n^2$, while the maximum hitting time of large sets is of order
$n$. Also, in this case $t_{\mathrm{stop}} = O(n)$, since for any starting point, the stopping time that chooses a
random target according to the stationary distribution and waits until it hits it, is stationary and
has mean of order $n$. This example demonstrates that for non-reversible chains, $t_H$ and $t_{\mathrm{stop}}$ can
be much smaller than $t_L$.

Example 10.2. The greasy ladder. 

Let $S = \{1, \ldots, n\}$ and $P(i, i+1) = \frac12 = 1 - P(i, 1)$ for $i = 1, \ldots, n-1$ and $P(n, 1) = 1$. Then it
is easy to check that

\[ \pi(i) = \frac{2^{-i}}{1 - 2^{-n}} \]

is the stationary distribution and that $t_L$ and $t_H$ are both of order 1.

This example was presented in Aldous [2], who wrote that $t_{\mathrm{stop}}$ is of order $n$. We give an easy proof
here. Essentially the same example is discussed by Lovász and Winkler [7] under the name "the
winning streak".

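Let $\tau_\pi$ be the first hitting time of a stationary target, i.e. a target chosen according to the stationary
distribution. Then starting from 1, this stopping time achieves the minimum in the definition of
$t_{\mathrm{stop}}$, i.e.

\[ \mathbb{E}_1[\tau_\pi] = \min\{\mathbb{E}_1[\Lambda] : \Lambda \text{ is a stopping time s.t. } \mathbb{P}_1(X_\Lambda \in \cdot) = \pi(\cdot)\}. \]

Indeed, starting from 1 the stopping time $\tau_\pi$ has a halting state, which is $n$, and hence from
Theorem 3.2 we get the mean optimality. By the random target lemma [1] and [5] we get that
$\mathbb{E}_i[\tau_\pi] = \mathbb{E}_1[\tau_\pi]$, for all $i \le n$. Since for all $i$ we have that

\[ \mathbb{E}_i[\tau_\pi] \ge \min\{\mathbb{E}_i[\Lambda] : \Lambda \text{ is a stopping time s.t. } \mathbb{P}_i(X_\Lambda \in \cdot) = \pi(\cdot)\}, \]

it follows that $t_{\mathrm{stop}} \le \mathbb{E}_1[\tau_\pi]$. But also $\mathbb{E}_1[\tau_\pi] \le t_{\mathrm{stop}}$, and hence $t_{\mathrm{stop}} = \mathbb{E}_1[\tau_\pi]$. A first-step
analysis gives the recursion $\mathbb{E}_1[\tau_{i+1}] = 2\,\mathbb{E}_1[\tau_i] + 2$ with $\mathbb{E}_1[\tau_1] = 0$, so that $\mathbb{E}_1[\tau_i] = 2^i - 2$ for all
$i \ge 2$, and hence

\[ t_{\mathrm{stop}} = \mathbb{E}_1[\tau_\pi] = \sum_{i=2}^{n} (2^i - 2)\, \frac{2^{-i}}{1 - 2^{-n}} = \frac{n - 2 + 2^{1-n}}{1 - 2^{-n}}, \]

which is of order $n$.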

This example shows that for a non-reversible chain $t_{\mathrm{stop}}$ can be much bigger than $t_L$ or $t_H$.
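
The quantities in Example 10.2 can be computed exactly for small $n$ by solving the first-step equations for hitting times. The sketch below is an illustration under stated assumptions: states $1, \ldots, n$ are stored as indices $0, \ldots, n-1$, and the helper names `solve`, `greasy_ladder`, `hitting_time_from_1` and `mean_stationary_target` are not from the paper. It checks stationarity of $\pi$ and evaluates $\mathbb{E}_1[\tau_\pi] = \sum_i \pi(i)\,\mathbb{E}_1[\tau_i]$.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                m[r] = [vr - m[r][col] * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

def greasy_ladder(n):
    """Transition matrix: P(i, i+1) = 1/2 = P(i, 1) for i < n, and P(n, 1) = 1."""
    half = Fraction(1, 2)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        P[i][i + 1] += half
        P[i][0] += half
    P[n - 1][0] = Fraction(1)
    return P

def hitting_time_from_1(P, target):
    """E_1[tau_target], with states indexed from 0 (so state 1 is index 0)."""
    n = len(P)
    out = [y for y in range(n) if y != target]
    a = [[(1 if y == z else 0) - P[y][z] for z in out] for y in out]
    h = dict(zip(out, solve(a, [Fraction(1)] * len(out))))
    return h.get(0, Fraction(0))

def mean_stationary_target(n):
    """E_1[tau_pi]: expected time from state 1 to hit a pi-distributed target."""
    P = greasy_ladder(n)
    pi = [Fraction(1, 2 ** i) / (1 - Fraction(1, 2 ** n)) for i in range(1, n + 1)]
    assert all(sum(pi[x] * P[x][y] for x in range(n)) == pi[y] for y in range(n))
    return sum(pi[i] * hitting_time_from_1(P, i) for i in range(n))

for n in (4, 8, 12):
    print(n, float(mean_stationary_target(n)))
```

The printed values grow linearly in $n$, in agreement with the claim that $t_{\mathrm{stop}}$ is of order $n$ for this chain.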

Question 10.3. The equivalence $t_H(\alpha) \asymp t_L$ in Theorem 1.1 is not valid for $\alpha > \frac12$, since for two
$n$-vertex complete graphs with a single edge connecting them, $t_L$ is of order $n^2$ and $t_H(\alpha)$ is at most
$n$ for any $\alpha > 1/2$. Does the equivalence $t_H(1/2) \asymp t_L$ hold for all reversible chains?
(Note that (9.2) and (9.4) show that the answer is positive for random walks on trees with arbitrary
edge conductances.)